TUTORIAL

RISK-BASED DECISION SUPPORT TECHNIQUES FOR PROGRAMS AND PROJECTS

Barney Roberts, Clayton Smith, and David Frost

This  article  is  designed  for  the  project  management  professional  who  intends 
to  make  risk-based  decision  making  a  fundamental,  integrating  principle  of  the 
project’s  operating  processes.  It  is  about  making  decisions  using  information 
that  relates  possible  future  outcomes  to  the  risk  inherent  to  decisions  made.  A 
project  manager  needs  to  make  two  types  of  decisions:  those  that  relate  to  the 
business  aspects  of  a  project  and  those  that  relate  to  the  performance  aspects 
of  the  product.  Part  1  details  the  project-focused  tools  and  techniques  and  Part 
2  details  the  product-focused  tools  and  techniques.  Advanced  integrated 
quantitative  techniques  and  tools  that  have  been  proven  to  have  high  utility  to 
decision  makers  are  presented. 


A project manager needs to make two types of decisions: those that relate to the business aspects of a project and those that relate to the performance aspects of the product. Decisions related to the business aspects focus on how much things cost or might cost, and on how long it takes, or may take, to do something. Business aspects of the project are fraught with risks in cost, schedule, fabrication, testing, and production of the product. Decisions related to the performance aspects of the product focus on things like reliability, maintainability, safety, and operations. Performance aspects of the product are fraught with risks in system failures, operational failures, environmental impacts, and overt and covert external threats.

This paper is divided into two parts: Part 1 for the project-focused tools and techniques and Part 2 for the product-focused tools and techniques. The following is a listing of those decision-support products that we have found to be of greatest utility and value to the projects.

1. Project-focused tools and techniques

   • Cumulative Distribution Functions (CDFs) for project completion date

   • CDFs for cost estimate at completion

   • Double Pareto boxes

   • Stochastic Critical Path Analysis

2. Product-focused tools and techniques

   • Bandaid Charts

   • Fussell-Vesely Charts

In each description of a tool or technique, the following format is used:

1. What is it? A description of the specific product,

2. How does it work? A brief overview of the analytical technique, and

3. What is its utility? A few examples of applications in decision making and the value derived.


Part 1: Project-Focused Tools and Techniques


Cumulative Distribution Functions for Project Completion

What is it? A Cumulative Distribution Function helps us understand the uncertainty or the confidence associated with stochastic variables. The CDF is the mathematical integral of the probability density function (PDF). The PDF represents the probability that each possible outcome of a random variable will occur. The sample plot in Figure 1 is a PDF and a CDF for the outcome of a pair of dice thrown a large number of times. The plot is read as follows: “The probability that an eight is thrown is 14 percent (the left-hand vertical axis) and the probability that an eight or less is thrown is 72 percent (the right-hand axis). Conversely, the probability that a number higher than eight is thrown is 28 percent.”

Figure 1. Simple Example of the Probability Density Function (PDF) and Cumulative Distribution Function (CDF) Plots Showing the Outcome for a Pair of Dice Thrown a Large Number of Times
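The dice example can be checked by direct enumeration. The following is a minimal sketch, not taken from the original article, that builds the PDF and CDF for the total of two fair six-sided dice; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
totals = [a + b for a, b in product(range(1, 7), repeat=2)]

# PDF (strictly, a probability mass function for a discrete variable):
# probability of each total from 2 through 12.
pdf = {t: Fraction(totals.count(t), len(totals)) for t in range(2, 13)}

# CDF: running sum of the PDF, i.e. probability of rolling t or less.
cdf = {}
running = Fraction(0)
for t in range(2, 13):
    running += pdf[t]
    cdf[t] = running

print(f"P(total = 8)  = {float(pdf[8]):.1%}")      # 5/36
print(f"P(total <= 8) = {float(cdf[8]):.1%}")      # 26/36
print(f"P(total > 8)  = {float(1 - cdf[8]):.1%}")  # 10/36
```

The three exact values, 5/36, 26/36, and 10/36, round to the 14, 72, and 28 percent quoted for Figure 1.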

Utility. The impact of risks on the expected completion date can be plotted. The effects of different mitigation plans can be compared, and a plan selected, based on the project’s propensity for risk. Mitigation actions will, in general, add cost to the baseline project; thus a project manager would want to see actual quantitative information on just how much residual risk would remain as a function of investments to mitigate the risks.

In the sample shown in Figure 2, the project manager can meter the mitigation investments against the project’s propensity for risk. For example, a real “gutsy” project manager may accept the 20 percent probability of completion, especially if the costs for the mitigation options are very high. On the other hand, if the costs are moderate or acceptable, or if the project manager is averse to risk, option A, moderate mitigation, may be chosen. At the other extreme, with a very low investment cost and/or a very risk-averse project manager, option B, substantial mitigation, may be a better choice.

How are CDFs Created? Here is where we get to the value of this paper, and of these tools and techniques, to project management staffs at different levels of maturity. This analysis cannot be successfully performed if the project does not have an integrated master schedule (IMS); it doesn’t have to be perfect, but it must be a reasonable semblance of an executable plan that is based on analogous experience.


Figure 2. Example Cumulative Distribution Function (CDF) and How a Project Can Meter Its Investments to Mitigate These Risks Versus Its Propensity for Risk




Acquisition Review Quarterly, Spring 2003


The technique, illustrated in Figure 3, is to collect the project’s risks, as defined within the framework of any typical risk management process; understand, through analogies or expert opinion, the impact on each Work Breakdown Structure (WBS) line item in the IMS; and then perform a Monte Carlo simulation. Several commercial software tools can perform this analysis, and the CDF is an output from any of them. We have experienced remarkable accuracy from these techniques in a set of six predictions that we made for a major space program, in cases where no mitigation actions were implemented and the risks were accepted. Our predictions, performed 18-24 months in advance, estimated a 5-6 month schedule slip due to risk. No mitigations were taken by the program, giving us the opportunity to test the accuracy of the tools. The results had a mean error of only 20 days at the 50th percentile.
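The Monte Carlo step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the commercial tools or the authors' model: the three serial tasks, their triangular duration ranges, and the two discrete risks are all hypothetical, and a real IMS would be a network rather than a simple chain.

```python
import random

# Hypothetical serial schedule: (task, low, most likely, high) duration in days.
TASKS = [
    ("design", 40, 50, 70),
    ("build", 80, 100, 150),
    ("test", 30, 40, 80),
]
# Hypothetical discrete risks: (affected task, probability, added days if it occurs).
RISKS = [("build", 0.30, 45), ("test", 0.20, 30)]

def simulate_once(rng):
    """One Monte Carlo trial: sample every task, then add any risks that fire."""
    durations = {name: rng.triangular(low, high, mode)
                 for name, low, mode, high in TASKS}
    for task, prob, delay in RISKS:
        if rng.random() < prob:
            durations[task] += delay
    return sum(durations.values())  # serial tasks: total is the plain sum

def completion_cdf(samples, day):
    """Empirical CDF: fraction of trials finishing on or before `day`."""
    return sum(s <= day for s in samples) / len(samples)

def percentile(samples, p):
    """Approximate p-th quantile by nearest rank on the sorted samples."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

rng = random.Random(2003)  # fixed seed so the sketch is repeatable
samples = [simulate_once(rng) for _ in range(10_000)]
deterministic = sum(mode for _, _, mode, _ in TASKS)  # 190 days, no risk
print("P(done by deterministic date):", completion_cdf(samples, deterministic))
for p in (0.2, 0.5, 0.8):
    print(f"{int(p * 100)}th percentile: {percentile(samples, p):.0f} days")
```

Evaluating `completion_cdf` over a range of days traces out the completion CDF; rerunning with a risk removed or reduced gives the mitigated curve to compare against it.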

Figure 3. The Cumulative Distribution Functions (CDF) Are Generated from a Monte Carlo Simulation of the Project’s Integrated Master Schedule (IMS)

An Actual Case Example. The examples shown here are from a NASA space exploration mission. In Figure 4, the spacecraft could be launched in either of the two narrow bands within a 6-8 week window occurring about six months apart. A significant amount of flight design, trajectory analysis, and mission planning must be performed to support either launch window. The planned launch was at the beginning of the first band, January 5, 2001, and flight design had begun to support the first launch window. However, the risk analysis showed less than a 20 percent chance of project completion in time for the first launch opportunity. Having the risk analysis demonstrate at least an 80 percent chance of completion in the second band, one month later, gave the project the confidence needed to switch the flight design to be consistent with the second launch window, saving significant extra cost given the high probability that the flight design would otherwise have needed to be repeated.

The CDF for completion can also be plotted over time to serve as a tracking metric for the total integrated risk environment of the project. The object is to have a visual display that illustrates a decreasing trend (or not) of the risk environment. One type of plot that accomplishes this is shown in Figure 5. This is from the same project that is referenced above in Figure 4, but the analysis was repeated over time and is plotted in Figure 5 with the 80th and 20th percentiles defining the ranges plotted and the diamond symbols indicating the 50th percentile. Figure 4 correlates to the date line of 3/00 in Figure 5.

Read the chart as follows: The upper horizontal band on the plot is “Ready Early.” “Ready On-Time” is the middle band that also spans the launch window. “Ready Late” is the lower band, which means a 6-month slip to the next launch window and all the costs associated with that slip. The upper line plotted is the deterministic completion date (i.e., no risk), and the lower line, plotted with the 20th and 80th percentile confidence bands, is the risk-adjusted completion date. The project’s objective is to continue to invest in risk mitigation actions until the band and the area of highest likelihood are no longer in the “Missed Launch Period” area of the chart. Note the improving trend over time, indicating the success of the risk mitigation actions as well as some “Accepted” risks passing their exposure window without becoming problems. One should ask, “What happened that caused the downward trend at the end?” A costly mitigation plan had been put in place to deal with a risky component. Seeing the substantial risk-based margin gave the project manager the confidence needed to abandon the mitigation plan, save the money, and still meet the completion date.

[Figure 4 plot labels: Opening Opportunity, Closing Opportunity]

The Cumulative Distribution Function (CDF) plots the probability that the spacecraft (S/C) is ready for launch on the date plotted or earlier. For example, this plot says that there is:

• Virtually no likelihood of making the scheduled launch date, Jan. 5, 2001

• Less than 20% chance of launching in the first opportunity

• 83% chance of launching by the end of the second opportunity

Figure 4. The Risk Analysis Clearly Demonstrated that the Mission Design Should Be Moved to the Second Opportunity

Figure 5. Tracking the Cumulative Distribution Function (CDF) Over Time Throughout the Project Life Cycle Gives the Project Manager a Trend of the Risk Environment

Cumulative Distribution Functions for Cost

What are they? They are simply the same function as the schedule CDF but with cost as the domain. These functions can be generated in a way that is consistent with schedule risks if the IMS is resource loaded. Many low-maturity projects carry the cost data in a database separate from the schedule information, making it very difficult to get a well-coordinated cost-schedule risk analysis. As long as the project has at least an acceptable IMS, one way to circumvent this low-maturity management approach is to use the cost data to create cost simulators that can be loaded into almost any of the tools used for the IMS (such as MS Project). When that is done, the results from the analysis will be both cost and schedule CDFs, and they will be consistent, which is a very important consideration in analyzing risk.
Utility. One obvious utility is expected cost at completion as a function of risk; the plots will be very similar to those shown in Figure 4, so that feature will not be discussed. However, cost CDFs may also be used to determine project reserves (see Figure 6). The project manager would do this by first establishing some level of acceptable risk, or propensity for risk, if the customer has not specified it. If not specified by the customer, this is usually done in a brainstorming session wherein the project management staff express their opinions. Sometimes it is set by organizational policy. Suppose that the project manager is risk averse and hence wants to be 80 percent certain that the project’s risk-based cost at completion will not exceed the project’s budget. The project manager must iterate the design and/or de-scope the risk profile until the 80th percentile aligns with the planned budget. Then the project manager would establish a reserve equal to the difference between the 50th percentile and the 80th percentile and hold that amount as a reserve against risk.
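The reserve rule is a difference of two percentiles of the cost CDF. The sketch below assumes the risk-adjusted cost samples already exist; here they are stand-ins drawn from a hypothetical triangular distribution (in $M), whereas in practice they would come from the Monte Carlo run of the resource-loaded IMS.

```python
import random

def percentile(samples, p):
    """Approximate p-th quantile by nearest rank on the sorted samples."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

def reserve(samples, confidence=0.80, baseline=0.50):
    """Reserve = cost at the chosen confidence level minus the median cost."""
    return percentile(samples, confidence) - percentile(samples, baseline)

rng = random.Random(6)  # fixed seed so the sketch is repeatable
# Hypothetical cost-at-completion samples in $M: low 90, most likely 105, high 160.
costs = [rng.triangular(90, 160, 105) for _ in range(10_000)]

p50, p80 = percentile(costs, 0.50), percentile(costs, 0.80)
print(f"50th percentile cost: {p50:.1f}")
print(f"80th percentile cost: {p80:.1f} (to be aligned with the planned budget)")
print(f"Reserve held against risk: {reserve(costs):.1f}")
```

A more risk-averse project would simply pass a higher `confidence` value.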

The most important use of the cost CDF is the analysis of the effectiveness of mitigation investments. One may create the cost CDFs for several options and then compare the investment to the return. Two things can happen: the curve can move to the left, reducing cost; and the slope can increase or decrease, indicating a change in uncertainty. Sometimes an investment that does not reduce the expected cost may still be desirable for the project because it reduces uncertainty.


Figure 6. An Illustration in Using Risk and the Cumulative Distribution Function (CDF) for Estimating Reserve Funding

Figure 7. Techniques for Decision Making on Risk Mitigation Investments Illustrate the Value of the Cost Cumulative Distribution Function (CDF)


Figure 7 illustrates these uses. Mitigation option A reduces the expected risk exposure by $X; if the investment to achieve this improvement is some acceptable fraction of $X, then the project manager should accept this option. Suppose that the project manager could invest another $Y and the result is option B. The expected value of the final cost is unchanged, but the reduction of uncertainty has value to the project. One possible judgment would be to compare the at-risk cost reduction at the 80th percentile with the investment; if this reduction is some multiple greater than the investment to achieve option B, then the project manager ought to make the additional investment.
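The two judgments can be written as a comparison of cost samples before and after a mitigation. This is a sketch under stated assumptions: the baseline distribution, the way options A and B are constructed (a pure left shift, then a mean-preserving narrowing), and the investment amounts are all hypothetical, and the acceptable "fraction" or "multiple" is left for the project to choose.

```python
import random
import statistics

def percentile(samples, p):
    """Approximate p-th quantile by nearest rank on the sorted samples."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

def compare(before, after, investment, p=0.80):
    """Savings in expected cost and in at-risk (p-th percentile) cost, with
    each saving also expressed as a multiple of the investment."""
    mean_saving = statistics.fmean(before) - statistics.fmean(after)
    at_risk_saving = percentile(before, p) - percentile(after, p)
    return {
        "mean_saving": mean_saving,
        "at_risk_saving": at_risk_saving,
        "mean_return": mean_saving / investment,
        "at_risk_return": at_risk_saving / investment,
    }

rng = random.Random(7)  # fixed seed so the sketch is repeatable
baseline = [rng.triangular(80, 160, 120) for _ in range(10_000)]  # hypothetical $M

# Option A: shifts the whole curve left by 10 (lower expected cost).
option_a = [cost - 10 for cost in baseline]
# Option B: same mean as A, but half the spread (lower uncertainty only).
mean_a = statistics.fmean(option_a)
option_b = [mean_a + 0.5 * (cost - mean_a) for cost in option_a]

result_a = compare(baseline, option_a, investment=4.0)  # the $X judgment
result_b = compare(option_a, option_b, investment=2.0)  # the $Y judgment
print(result_a)
print(result_b)
```

For option B the expected-cost saving is essentially zero, so the decision rests on `at_risk_return`, the 80th-percentile reduction expressed as a multiple of the investment.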


Double  Pareto  Boxes 


What are they? The Double Pareto boxes are two-dimensional arrays: the rows are the WBS line items that are impacted by risks, and the columns are the individual risks. The cells of the matrix contain some attribute of the project that is important to risk-manage, usually dollars or days. Any spreadsheet software that supports sorting is a suitable tool for this analysis. Once the data are extracted from the Monte Carlo network analysis described above and loaded into the cells, the sorting functions are used to sort the highest cell values into the upper left corner. Then the matrix is sectioned, or truncated, at the row (the WBS items) where the cumulative summation of the cell values equals 80 percent of the total and at the column (the risks) under the same condition; hence “Double Pareto” box. The result is the sectioning off of those few risks that are causing 80 percent of the problem and those few WBS line items that are “receiving” that 80 percent of the impact.
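The sort-and-truncate procedure can be sketched without a spreadsheet. The matrix below is hypothetical (impact in days per WBS item per risk), and the sketch assumes that row and column totals are plain sums of the cell values, which is a simplification of a baseline run with all risks active.

```python
def pareto_cut(totals, threshold=0.80):
    """Largest contributors in descending order, up to and including the one
    at which the cumulative share first reaches `threshold`."""
    grand_total = sum(totals.values())
    kept, running = [], 0.0
    for name, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(name)
        running += value
        if running >= threshold * grand_total:
            break
    return kept

def double_pareto(matrix, threshold=0.80):
    """matrix[wbs][risk] = impact. Returns (kept WBS rows, kept risk columns)."""
    row_totals = {wbs: sum(cells.values()) for wbs, cells in matrix.items()}
    col_totals = {}
    for cells in matrix.values():
        for risk, value in cells.items():
            col_totals[risk] = col_totals.get(risk, 0.0) + value
    return pareto_cut(row_totals, threshold), pareto_cut(col_totals, threshold)

# Hypothetical impacts in days; names are placeholders, not project data.
matrix = {
    "WBS 1.1": {"R1": 20, "R2": 0, "R3": 1, "R4": 0},
    "WBS 1.2": {"R1": 15, "R2": 4, "R3": 0, "R4": 1},
    "WBS 1.3": {"R1": 0, "R2": 8, "R3": 0, "R4": 0},
    "WBS 1.4": {"R1": 1, "R2": 0, "R3": 2, "R4": 0},
    "WBS 1.5": {"R1": 0, "R2": 1, "R3": 0, "R4": 1},
}
rows, cols = double_pareto(matrix)
print("WBS items inside the box:", rows)
print("Risks inside the box:", cols)
```

With these numbers, three of the five WBS items and two of the four risks fall inside the box.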

How are they created? The Monte Carlo IMS analysis tool that was used above is used to perform this analysis, but it is run for each individual risk, one at a time. The data are used to fill the cells of the matrix. The cells can contain the risk impact in days or dollars or, in fact, in any resource or metric considered to be of value to the project. We recommend that the project hold a brainstorming session in the early phases to determine “What is to be risk-managed.”

Utility. Program and project resources are precious and should not be spent on trivial issues. In the cases where we have used the Double Pareto box, we have found a great reduction in the number of risks that need to be mitigated and tracked and in the number of WBS line items that are threatened by risks. For the Space Station, the “worry” risks were reduced by an order of magnitude and the “worried” WBS line items were on the order of a dozen. This also provides the program manager or project manager with a tool to deal with “whiners,” who can be quickly weeded out by checking the Double Pareto box to see if they made the cut.


                                        Risk Drivers
Impacted Task Title                     Baseline  Risk 10   Risk 27   Risk 61   Cumulative
                                                  Late      Star      Sequence  Contribution
                                                  Software  Tracker   Timer

GN FSW BUILD 4.0 Delivery to ATLO -     19.8      19.8      0.0       0.0       22.3%
  Science
ATLO SCHEDULE MARGIN - Denver           17.7      0.0       3.2       7.4       42.2%
  (Used to model ATLO overrun)
GN FSW BUILD 3.0 Delivery to 4.0        16.7      16.7      0.0       0.0       61.0%
  for ACS testing (MST 3)
Star Tracker FLT Design, Purchase,      9.1       0.0       9.1       0.0       71.2%
  Receive, and Test
FSW Phase 5.0 Delivery to ATLO -        9.0       9.0       0.0       0.0       81.3%
  Launch

Total                                   88.9      53.9      12.3      11.0
Cumulative Contribution                           60.6%     74.4%     86.8%

The Double Pareto box greatly focuses the project’s attention on the few risks that cause 80 percent of the problem and the few WBS lines that are receiving this 80 percent impact. The cells in this graphic contain either delta-dollars or delta-days due to each risk.

Figure 8. The Double Pareto Box
An example of an actual Double Pareto box from a NASA space science mission is shown in Figure 8. In this example, the left-most column is the baseline case with all risks; the columns to its right are the single driver risks. As few as three risks, of the project’s approximately 34, constitute the 80 percent, and only five tasks, out of 1000, bear that impact. In this case, the project manager’s span of attention was greatly reduced.

Stochastic Critical Path Analysis

What is it? The Stochastic Critical Path has been the most valued product that we have produced for the project manager. It is a specific portrayal of the schedule network of a project wherein an additional piece of information is added. First the deterministic critical path is highlighted, and then the various stochastic critical paths are highlighted, to produce a visual image for decision making. The deterministic critical path is the critical path that is determined without the consideration of risk; it is the collection of tasks that determine the total completion time of the project. Due to risk, there is a probability that some other tasks not on the deterministic critical path may increase in duration so that they increase the total completion time of the project. The risks put those tasks on a probabilistic critical path with an associated probability.

In the absence of the risk analysis, the deterministic critical path will be the one to which a project manager devotes the maximum amount of attention and, hence, management resources. The stochastic critical paths are all of the other paths that may be critical depending on risk and hence, without the analysis, may never catch the attention of the project manager. Being dependent on the outcome of a risk, they are “critical” with some probability.

Figure 9. The Stochastic Critical Path Chart

How is it created? The stochastic critical path is a result of the Monte Carlo schedule network solution illustrated in Figure 3. As the network is analyzed and sampled via the Monte Carlo analyzer, the critical path is recorded for each iteration and the software captures the frequency with which each task is on the critical path. Figure 9 illustrates the result.
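Recording the critical path on every iteration can be sketched on a toy network. Everything here is an assumption for illustration: the four-task network, the triangular duration ranges, and the function names are hypothetical and do not reproduce any commercial scheduling tool.

```python
import random
from collections import Counter

# Hypothetical network: task -> (predecessors, (low, mode, high) duration in days).
NETWORK = {
    "A": ([], (10, 12, 15)),
    "B": (["A"], (20, 25, 45)),    # path A-B-D, critical at most-likely durations
    "C": (["A"], (22, 24, 28)),    # path A-C-D, one day shorter at most-likely durations
    "D": (["B", "C"], (5, 6, 8)),
}
ORDER = ["A", "B", "C", "D"]  # a topological order of the tasks

def critical_path(durations):
    """Forward pass for finish times, then walk back along the driving predecessors."""
    finish, driver = {}, {}
    for task in ORDER:
        preds = NETWORK[task][0]
        driver[task] = max(preds, key=lambda p: finish[p]) if preds else None
        start = finish[driver[task]] if preds else 0.0
        finish[task] = start + durations[task]
    path, task = [], max(ORDER, key=lambda t: finish[t])
    while task is not None:
        path.append(task)
        task = driver[task]
    return path[::-1]

def criticality(trials=10_000, seed=9):
    """Fraction of Monte Carlo trials in which each task lies on the critical path."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        durations = {task: rng.triangular(low, high, mode)
                     for task, (_, (low, mode, high)) in NETWORK.items()}
        counts.update(critical_path(durations))
    return {task: counts[task] / trials for task in ORDER}

most_likely = {task: mode for task, (_, (_, mode, _)) in NETWORK.items()}
print("Deterministic critical path:", critical_path(most_likely))
print("Probability of being critical:", criticality())
```

Task C never appears on the deterministic critical path, yet it is critical in a nonzero fraction of trials; that fraction is the quantity used to shade the bones of the chart.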

Imagine the diagram to be a fishbone. Let the spine of the fishbone be a representation of the deterministic critical path. Greatly simplify the activities on the spine, collecting activities into groups such that there is a single collective, and no more than two, between the bones that come into the spine. The bones that come into the spine are the alternative stochastic critical paths. Color or shade the bones (i.e., stochastic critical path activities) to correspond to a legend that specifies the probability of being on the critical path. Note that commercial software analysis packages use other terms to describe the probability of being on the critical path: (1) criticality and (2) diversity.

Figure 9 is a simplified Stochastic Critical Path diagram used to illustrate the fundamental features, and Figure 10 is a diagram from an actual project.

Utility. The project manager now has a quantitative representation as to where risk-management resources need to be invested. For example, in Figure 9, at-risk tasks 4 and 5 should be ignored; even if they are high-risk items, they just can’t “catch up” with the critical path because it is driven so hard by at-risk tasks 1 and 2. Also note that without the stochastic critical path (CP), the project manager will be managing to the deterministic CP and will be focused on the wrong things. The chart also tells the program manager that dollars to mitigate risk in at-risk task 2 have twice (or 2.23 times) the value of dollars used to mitigate risk in at-risk task 1.

Figure 10. A Simplified Version of the Stochastic Critical Path Developed for an Actual Project

An Actual Case Example. Figure 10 is the stochastic critical path for a NASA space exploration mission. The actual names of the tasks have been replaced with generic names in some cases, and a few other simplifications were made to get the graphic to fit in this paper. First note that the Experiment Package was a high-risk item but never appeared on the critical path because the other tasks drove the critical path so hard. Thus, the Experiment Package could be put on the watch list. There are three other probabilistic critical paths that are competing with each other almost equally, each being 50 percent probable. Also, unobservable here because of the simplifications, there are many tasks on the deterministic critical path prior to the task labeled SC-4 that were never on the critical path when risk is considered; thus the project could relax its vigil there as well. Also note the dominance of the Flight Software (FSW) packages on the various stochastic critical paths, which provides a primary focus for risk mitigation. It’s also important to note that schedule must be something the program wants to risk-manage if this is to be useful.

Part 2: Product-Focused Tools and Techniques

Bandaid Charts and Importance Value Charts

What are they? There are several useful risk-based decision support products that are extractable from a probabilistic risk assessment (PRA) of the product being developed and delivered by the project. Of those products, the ones that seem to have the greatest utility are the Bandaid Charts and the Importance Value Charts. The Bandaid Charts are named for their appearance, in that they look very similar to bandages produced by brand-name companies such as Band-Aid. They are a spread of probable outcomes that appear on the chart as a “band” of values, with one end being the 5th percentile, the other end the 95th percentile, and the center marked or densified to indicate the central tendency; hence the Band-Aid appearance. One of these is produced for each end state of interest. A sample Bandaid Chart is shown in Figure 11.

Figure 11. Sample Bandaid Chart

The Importance Value Charts are derived from a probabilistic risk assessment technique. They are a result of metrics, collected as the PRA is performed, that reflect the contribution of selected systems or components to the overall failure rate. They are often normalized to some specific parameter of the decision to avoid the problem of everyone focusing on the failure rates rather than the decision information. The Importance Value Chart is a bar graph with each bar representing a specific system’s, subsystem’s, or component’s contribution to the overall probability of an undesirable event. The bars are arranged in order, with the “tallest” bar on the left and subsequently shorter bars progressing to the right. Figure 12 is a typical example.

To support a project’s decisions, one may mark the point where 80 percent of the total loss rate is accumulated by the systems. This quickly draws the decision maker’s attention to those few items that need to be subjected to design improvement or additional testing and verification.

How are they created? Both the Bandaid Charts and the Importance Value Charts are outputs from post-analysis of data from a probabilistic risk assessment. PRAs can be created in two ways: bottom-up and top-down. The bottom-up approach can be very costly in that many components are analyzed and modeled to a level of detail that may not affect the end result. Working from the top down has, so far in our experience, been able to produce the decision-support information at a fraction of the cost of a bottom-up analysis. Of course, if the project can afford the bottom-up analysis, then it would be more thorough and probably the best approach.

Figure 12. Importance Value Chart

In performing the top-down analysis, one must first determine and then model the most undesirable outcomes. These could be Loss of Crew or Loss of Vehicle if one were analyzing a space program with crewpersons present; Loss of Science or Reduced Science Quality if one were analyzing a robotic space exploration mission; or Lethality or Survivability if one were analyzing a Department of Defense (DoD) weapon system or system of systems.

Whatever those undesirable outcomes, there will exist logical scenarios triggered by initiating events. Those scenarios will contain response actions based on the relevant initiating event. The Master Logic Diagram (MLD) is used to identify the initiating events and to put them in context with each other. The MLD is a top-down logical representation of the system (see Figure 13).

Then, for each element in the MLD, one would build event sequences that could cause the scenario to be executed. The event sequence diagrams start with an initiator event and end with many end states. The events ask about redundancy, repair, operational workarounds, other consequences, and responses of the system. Note that these scenarios are developed “given that the initiator has occurred.” For each relevant element in the event sequence, Fault Trees are constructed that describe the failure events in the systems, subsystems, or components in the product.


[Figure 13 diagram labels: Event, Failure History Data]

The PRA approach is most effective and efficient when scenario driven, modeled from the top-down, and decomposed no lower than the level that affects the decision information.

Figure 13. The Master Logic Diagram (MLD)




Only  those  fault  trees  that  are  related 
to  the  undesirable  outcome  are  modeled. 
In  addition,  the  model  is  scenario  driven 
to  account  for  all  system  and  operator 
intermediate  actions  that  result  from  the 
initiating  event. 

At the very bottom of the model is the database that feeds the fault
trees. Another cost-saving measure can be employed here: one can stop at
the system or subsystem level and need not drill all the way down to the
component level if the system or subsystem does not prove to be a
significant driver. This is possible because the elements of the model
are represented by probability density functions. For example, a power
system of a specific type can be modeled using data from all analogous
power systems available from previous projects. This will, of course,
result in a broad probability density function because of the variables
that are present at the next level of decomposition. However, if it makes
no appreciable difference in the decision-support information, why bother
with further decomposition?
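The screening idea above can be sketched with a small Monte Carlo roll-up. Everything here is an assumption made for illustration: the two-element series model, the log-uniform and uniform ranges, and the 10 percent screening threshold are hypothetical and do not come from the article.

```python
import math
import random
import statistics

def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def power_share_of_loss(n_trials=20_000, seed=1):
    """Estimate the power system's share of the mean loss probability."""
    rng = random.Random(seed)
    power_draws, total_draws = [], []
    for _ in range(n_trials):
        # Broad distribution: spans the (hypothetical) range seen on analogous systems.
        p_power = sample_log_uniform(rng, 0.0005, 0.004)
        # Hypothetical dominant driver, modeled in more detail elsewhere.
        p_propulsion = rng.uniform(0.05, 0.15)
        power_draws.append(p_power)
        total_draws.append(1.0 - (1.0 - p_power) * (1.0 - p_propulsion))
    return statistics.mean(power_draws) / statistics.mean(total_draws)

share = power_share_of_loss()
decompose_further = share > 0.10  # hypothetical screening threshold
print(f"power share of loss: {share:.1%}, decompose further: {decompose_further}")
```

Even with a deliberately broad distribution, the power system in this made-up example contributes only a few percent of the loss probability, so decomposing it to the component level would not change the decision information.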

A good suite of PRA tools exists, such as QRAS, SAPHIRE, and Monte Carlo
simulators that are add-ons to spreadsheets, for example Palisade
Software’s @Risk for Excel or PrecisionTree.

Utility of the Bandaid Chart. It is important to note here that the
output of these types of analyses is a probability of failure, a specific
number that often attracts so much attention that the utility of the
analysis is lost. Of course, the number can be unacceptably high,
requiring a significant redesign. But once we get beyond that point, it
is better to try to understand what the model is telling us about
weaknesses in our system than to focus on the number. Too often we have
all been engaged in heated arguments and prolonged discussions about the
specific value of the probability of failure and, probably in some cases,
in efforts to cover up the number, discredit it, or change the inputs to
produce a more favorable outcome to present to the stakeholders. Hence,
one should attempt to normalize the results to some specific baseline to
avoid these discussions and assist the project manager in making good
decisions.

The Bandaid Charts are useful in decision making for selection of options when the input information is fraught
with uncertainty.

Figure 14. The Bandaid Charts

The real utility in the Bandaid charts is decision making about options.
A sample Bandaid Chart is shown in Figure 14. In any PRA, the actual
values for the failure rates of the systems and subsystems in the fault
tree are seldom known as precise point values but more often are
probability distributions that represent the uncertainty of the
information available for a system or subsystem. Hence, the result of
the analysis will also be an uncertain number.

In Figure 14, Option C is clearly better than Option B, having a lower
probability of occurrence over its entire uncertainty region
(Statistically Dominant). Option A and Option C overlap; thus Option C is
not Statistically Dominant over Option A. There exist a fair number of
possible outcomes of Option A that could be better than Option C. One
might still want to select Option C because of its much lower range of
uncertainty, which will provide a more stable planning environment.
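The dominance test described above reduces to an interval comparison. The following is a minimal sketch; the band endpoints are hypothetical values chosen to mimic the relationships described for Figure 14, not numbers read from the figure.

```python
from typing import NamedTuple

class Band(NamedTuple):
    """Uncertainty region for an option's probability of the undesired outcome."""
    low: float
    high: float

    @property
    def width(self) -> float:
        return self.high - self.low

def dominates(better: Band, worse: Band) -> bool:
    """True when every value in `better` lies below every value in `worse`."""
    return better.high < worse.low

# Hypothetical bands (illustrative only).
options = {
    "A": Band(0.010, 0.060),
    "B": Band(0.050, 0.090),
    "C": Band(0.020, 0.035),
}

print(dominates(options["C"], options["B"]))    # C lies entirely below B
print(dominates(options["C"], options["A"]))    # C and A overlap
print(options["C"].width < options["A"].width)  # C has the narrower band
```

When neither option dominates, the width of each band is a second criterion: a narrower band means a more predictable outcome, which is the argument the text makes for Option C over Option A.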

An Actual Case Example. In this actual case (Figure 15), a new upgrade
system was proposed for a NASA launch vehicle. When the median value of
the probability of loss of vehicle for the upgrade was compared with that
of the existing design, the upgrade showed a notable improvement, on the
order of a 65 percent reduction in loss of vehicle. However, when the
analysis was done considering the uncertainty in the information, there
was significant overlapping of the two probability density functions.
Calculations showed that there was a 15 percent chance that the upgrade
would actually be worse in contribution to loss of vehicle than the
existing design. Weighing the investment costs against the risk that the
upgrade might perform worse than the existing design led to the decision
to retain the existing design.

Relative Risk Comparisons for Loss of Vehicle

The Bandaid chart was produced demonstrating that the uncertainty of the information indicated that
the upgrade has some probability of being worse than the existing system. The overlapping probability
densities (not shown here) equated to a 15 percent chance of being worse than the existing system.

Figure 15. Actual Bandaid Chart Example
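The kind of calculation behind a "chance of being worse" figure can be sketched as follows. This is not the analysis from the NASA case: the lognormal form, the medians, and the spreads are hypothetical, chosen only so that the upgrade's median is 65 percent lower while its uncertainty is wider, and the two quantities are assumed to be independent.

```python
import math
import random
from statistics import NormalDist

def prob_upgrade_worse_exact(median_existing, sigma_existing,
                             median_upgrade, sigma_upgrade):
    """P(upgrade > existing) for independent lognormal loss probabilities."""
    # The difference of the logs is normal, so compare it against zero.
    mean_diff = math.log(median_upgrade) - math.log(median_existing)
    sd_diff = math.hypot(sigma_existing, sigma_upgrade)
    return 1.0 - NormalDist(mean_diff, sd_diff).cdf(0.0)

def prob_upgrade_worse_mc(median_existing, sigma_existing,
                          median_upgrade, sigma_upgrade,
                          n_trials=50_000, seed=7):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    worse = 0
    for _ in range(n_trials):
        existing = rng.lognormvariate(math.log(median_existing), sigma_existing)
        upgrade = rng.lognormvariate(math.log(median_upgrade), sigma_upgrade)
        worse += upgrade > existing
    return worse / n_trials

# Hypothetical inputs: upgrade median is 65 percent lower but far less certain.
params = dict(median_existing=0.010, sigma_existing=0.4,
              median_upgrade=0.0035, sigma_upgrade=0.9)

exact = prob_upgrade_worse_exact(**params)
simulated = prob_upgrade_worse_mc(**params)
print(f"exact: {exact:.3f}, simulated: {simulated:.3f}")
```

The point of the sketch is that a large improvement in the median can coexist with a non-trivial probability of being worse once the spread of each distribution is taken into account.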

Importance  Values 

Utility. The best way to illustrate the utility is to imagine that a PRA
has been completed and you can present to the decision maker either
(1) “The project has a 73 percent chance of success” or (2) “These three
systems contribute 80 percent of the threat of loss.” Both answers are
useful to the decision maker, but answer (2) provides much more
opportunity to make effective and efficient decisions for improvement.
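One way to produce an answer of type (2) is to rank each system by how much of the loss probability would disappear if that system never failed. The sketch below uses that definition as an assumption; the article does not state how its Importance Values are computed. It also assumes independent systems in series, and the system names and probabilities are hypothetical.

```python
from math import prod

def loss_probability(failure_probs):
    """Loss probability for independent systems in series (any failure is a loss)."""
    return 1.0 - prod(1.0 - p for p in failure_probs.values())

def importance_values(failure_probs):
    """Fractional drop in loss probability if each system were made perfect."""
    baseline = loss_probability(failure_probs)
    values = {}
    for name in failure_probs:
        perfected = {**failure_probs, name: 0.0}
        values[name] = (baseline - loss_probability(perfected)) / baseline
    # Sort so the largest contributors come first.
    return dict(sorted(values.items(), key=lambda item: item[1], reverse=True))

# Hypothetical system failure probabilities (illustrative only).
systems = {
    "system_a": 0.10, "system_b": 0.08, "system_c": 0.06,
    "system_d": 0.02, "system_e": 0.01, "system_f": 0.005,
}

print(f"chance of success: {1.0 - loss_probability(systems):.0%}")
for name, value in importance_values(systems).items():
    print(f"{name}: {value:.1%}")
```

The single success number and the ranked list come from the same model; the ranked list is what tells the decision maker where to spend improvement effort.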

Project decision makers use the Importance Value Charts to refocus early
design activities as well as to make midcourse corrections as the design
matures. The charts also permit the planning for test and verification to
focus on the systems that threaten success. It should also be noted that
these threats are strongly dependent on how the product is operated. This
assists in the operational planning or the design of support systems, in
that the operational scenarios can be designed to focus on ways to
desensitize the impact of failures in these systems or to operate them
differently to reduce stress and hence failure rate.

The Importance Value Chart helps the project to focus design, test, and operations planning resources on
those few systems, subsystems, or components that are contributing the most to the failure to achieve an
objective.

Figure 16. The Importance Value Chart

Actual Case Example. The project chosen for this case example was a
robotic space exploration mission. The PRA was performed to support
decisions to be made at the project’s Critical Design Review. One
Importance Value Chart is presented in Figure 16. In this chart, the
pyrotechnics used to separate the solar arrays and permit them to be
deployed after launch were the primary drivers, with a thruster failure
being a minor contributor.

In this specific project, the decision makers were surprised that the
pyrotechnic devices were the major driver because of the exceptionally
high reliability of the devices. Indeed, they are highly reliable, but
they were one of the very few critical items that were single-point
failures. It was not that they were “low” reliability, but they were less
reliable than all the other subsystems, most of which were functionally
redundant. Knowing that there exists a distribution of reliability data
in the estimates, the decision makers took actions to assure that they
were getting the best of the lots, such as increasing the quality
assurance measures on the pyrotechnics, independent inspections, and
additional testing. Similar charts were used to define the operational
test and simulation procedures, to inject failures into the simulations
that represented the “tall-pole” failures identified by the Importance
Values, and to develop contingency operational procedures should they
occur.
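The surprise described in this case has a simple arithmetic explanation, sketched below. The failure probabilities are hypothetical and are not data from the mission; the redundant units are assumed to fail independently.

```python
def redundant_failure(unit_failure: float, n_units: int = 2) -> float:
    """Failure probability of a function backed by n independent redundant units."""
    return unit_failure ** n_units

# Hypothetical values (illustrative only).
pyro_failure = 0.002   # highly reliable device, but a single-point failure
unit_failure = 0.02    # each redundant unit is ten times less reliable

contributions = {
    "pyrotechnic device (single point)": pyro_failure,
    "redundant subsystem (two units)": redundant_failure(unit_failure),
}

driver = max(contributions, key=contributions.get)
for name, probability in contributions.items():
    print(f"{name}: {probability:.4f}")
print(f"primary driver: {driver}")
```

In this made-up example the individual redundant units are far less reliable than the pyrotechnic device, yet the redundant function as a whole fails less often, so the single-point item ends up as the primary driver.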

Closing  Comments 


We recommend that projects consider the value and utility of these
products and implement them where appropriate. Should the project manager
worry about the affordability of this type of analysis, it is best to
remember that actions taken based on these analyses avoided expenditures
that exceeded the cost of the analyses by a factor of 20. The simple act
of directing the mission design to support the second launch opportunity,
as shown in Figure 4, rather than following the original plan, saved the
project more money than the cost of the analysis for the entire project
life cycle.